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Abstract 

In this paper, we propose a new method for estimating the conditional risk-neutral density 
(RND) directly from a cross-section of put option bid-ask quotes. More precisely, we propose 
to view the RND recovery problem as an inverse problem. We first show that it is possible to 
define restricted put and call operators that admit a singular value decomposition (SVD), which 
we compute explicitly. We subsequently show that this new framework allows us to devise a simple 
and fast quadratic programming method to recover the smoothest RND whose corresponding put 
prices lie inside the bid-ask quotes. This method is termed the spectral recovery method (SRM). 
Interestingly, the SVD of the restricted put and call operators sheds some new light on the RND 
recovery problem. The SRM improves on other RND recovery methods in the sense that 1) it is 
fast and simple to implement since it requires solution of a single quadratic program, while being 
fully nonparametric; 2) it takes the bid ask quotes as sole input and does not require any sort of 
calibration, smoothing or preprocessing of the data; 3) it is robust to the paucity of price quotes; 
4) it returns the smoothest density giving rise to prices that lie inside the bid ask quotes. The 
estimated RND is therefore as well-behaved as can be; 5) it returns a closed form estimate of the 
RND on the interval [0, B] of the positive real line, where B is a positive constant that can be 
chosen arbitrarily. We thus obtain both the middle part of the RND together with its full left tail 
and part of its right tail. We confront this method to both real and simulated data and observe 
that it fares well in practice. The SRM is thus found to be a promising alternative to other RND 
recovery methods. 

KEY WORDS: Risk- neutral density; Nonparametric estimation; Singular value decomposition; Spectral 
analysis; Quadratic programming. 
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1 Introduction 



1.1 The setting 

Over the last four decades, the no-arbitrage assumption has proved to be a fruitful starting point that 
paved the way for the elaboration of a rich theoretical framework for derivatives pricing known today 
as arbitrage pricing theory. Among its numerous achievements, the arbitrage pricing theory has set 
forth two fundamental theorems. The First Fundamental Theorem of Asset Pricing (see [21, p. 72]) 
proves that a market is arbitrage- free if and only if there exists a measure Q equivalent to the historical 
(or statistical) measure P, which turns the underlying price process into a martingale. Q is therefore 
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referred to as a martingale measure. The Second Fundamental Theorem of Asset Pricing (see [21, 
p. 73]) proves in turn that this martingale measure is unique if and only if the market is complete (see 
[21, p. 300] for terminology). Let us denote by S T the positive valued price of the underlying at a 
deterministic future date r and by tt(S t ) the payoff of a contingent claim maturing at time t. Let us 
moreover denote by q the marginal density of S T under Q with respect to the Lebesgue measure on 
the positive real line, assuming that it exists. As initially proved in [10], the arbitrage price of this 
derivative security writes as its discounted expected payoff under Q, that is, 

e- rT EQTT(S T ) = e~ rT [ tt(x)Q(S t G dx) = e~ rr [ ir(x)q(x)dx, 

Jx>0 Jx>0 

where r stands for the continuously compounded risk-free rate. It is a widely acknowledged fact that 
financial markets are incomplete, if only due to the presence of jumps in the underlying price process. 
In such a setting, and as described above, there exist possibly very many qs, and therefore, very many 
corresponding systems of arbitrage-free prices. Let us denote by the corresponding set of valid 
densities q. The elements q of J2 are most often referred to as risk-neutral densities (RNDs) and we 
will stick to this terminology in the sequel. 

RNDs are of crucial interest for Central Banks and, in fact, most institutions and people concerned 
with financial markets since they represent the market sentiment about a given underlying price pro- 
cess at a future point in time (see [3]). They are also of crucial interest to the financial derivatives 
industry since the knowledge of q allows to price new derivative securities in an arbitrage-free way with 
respect to traded ones. For these reasons, the literature related to risk-neutral density estimation is 
very extensive, the bulk of it dating back to the late 90's and early 2k's. It is not our purpose here to 
present an exhaustive review of this literature. Excellent and up-to-date reviews can in fact be found 
in [14, 17]. Older but still relevant ones can be found in [9, 3]. 

Among derivative securities, call and put options play a very particular role since they are actively 
traded in the market and thus believed to be efficiently priced. Let us recall that a call of strike £ and 
maturity r gives its holder the right to buy the underlying security at maturity time t at price £. It is an 
insurance against a rise in the price of the underlying. Its payoff writes tt(S t , £) = 6(S T , £) = (S T — £) + , 
where we have written (x) + = max(x,0) for x £ K. Conversely, a put option gives the right to sell 
the underlying security. It is an insurance against a fall in the underlying price and its payoff writes 
6*(S T ,^) = (£ — S T ) + . Here and in what follows, we denote the strike price by £ and not by k, which 
will stand for a running index in N. 

According to the celebrated Breeden-Litzenberger formula, the second derivative of put and call prices 
with respect to their strike price both equal the discounted RND e~ TT q (see [6]). Therefore, if a 
continuum of put or call prices were available in the market, we would have direct access to the RND 
by the latter formula. However, this is not the case and only a few strike prices around the forward 
price are quoted and actively traded at each maturity date. Depending on the market, we overall 
reckon from 5 to 50 quotes at a given maturity date r. To complicate the matter even more, quotes do 
not appear as a single price. Dealers quote in fact a bid price, at which they offer to buy the security, 
and an ask price, at which they offer to sell the security. The difference between both prices is referred 
to as the bid-ask spread. For an interesting insight into the nature of option quotes and sources of 
error in them, the reader is referred to, say, [16, p. 786]. 

1.2 The problem and brief literature review 

As detailed above, if traded puts and calls at a given maturity r are arbitrage free, they must write 
as their expected discounted payoff with respect to a single RND q drawn from the set £t. Given the 
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paucity of quoted option prices at a given maturity r and the presence of a bid-ask spread, it is clear 
that many RNDs could in fact be hidden behind quoted option prices. Therefore, the RND quest is 
not that much about estimating the true RND that is used by the market for pricing purpose, since the 
nature of the quotes does not allow to identify it uniquely It is rather more about recovering a valid 
RND, meaning an actual density function, to be chosen according to a criterion typically related to its 
smoothness or information content. Historically, three main routes have been used to recover a RND 
from quoted option prices: parametric methods, nonparametric methods and models of the underlying 
price process. Each of them have their pros and cons. Parametric methods are well adapted to small 
data sets and always recover a density. However, they constrain the RND to belong to a given paramet- 
ric family. On the other hand, models of the underlying price process have been the first great success 
of arbitrage pricing theory with the celebrated geometric Brownian motion (see [4, 20]). However, 
the limitation of the log-normal distribution is now widely acknowledged and no satisfying stochastic 
process has yet been proposed that both reproduce accurately the dynamics of the underlying price 
process and be analytically tractable. Nonparametric methods circumvent both of these problems in 
the sense that they do not require any stringent assumption on the process generating the data (they 
are model-free) and can recover all possible densities. As a main drawback, these methods are often 
data intensive. 

Let us briefly come back on some contributions to the nonparametric literature which are relevant to 
the present paper. We can classify nonparametric methods as follows. 

• The expansion methods. It includes the Edgeworth (see [19]) and cumulant expansions (see 
[22]), which allow to estimate a finite number of RND cumulants. It also includes orthonormal 
basis methods such as Hermite polynomials (see [1]), which rely on well known Hilbert space 
techniques and give access to the middle part of the RND. 

• The kernel regression methods. As a recent example, [2] have introduced a shape constrained 
local polynomial estimator of the RND. Notice that it performs estimation on the average quoted 
prices (that is, the average of the bid- ask quotes) and requires therefore to pre-process them in 
order to make them arbitrage-free. Moreover, the returned RND depends on the kernel chosen 
and it is not clear how it relates to the other valid RNDs in term of information content or 
smoothness. 

• The maximum entropy method. It is introduced in [8, 23], where the RND q is obtained via 
the maximization of an entropy criterion. According to [9, p. 19], this method often gives bumpy 
(multimodals) estimates since it imposes no smoothness restriction on the estimated density. In 
addition, it is claimed in [18, p. 1620], that this method presents convergence issues. 

• Other methods, which do not belong to any of the three categories above. Among them, we 
can refer to the positive convolution approximation (PCA) of [5], In practice, it fits a finite 
(but large) convex linear combination of normal densities to the average quoted put prices and 
approximates the RND by the weights of the linear combination. It thus presents similarities with 
[18], since it ultimately fits a discrete set of probabilities to the average quoted prices. We can 
also refer to the smoothed implied volatility smile method (SML) as in [14]. This method uses 
the Black-Scholes formula as a non-linear transform. It consists in fitting a polynomial through 
the implied volatilities obtained from average quoted prices, and using the continuum of option 
prices obtained in that way to get the RND via the Breeden-Litzenberger formula. [14] refines 
this method by taking the bid-ask quotes into account at the implied volatility fit stage. The 
SML method gives access to the middle part of the RND. [14] proposes in addition a method 
for appending generalized extreme value (GEV) tail distributions to it. The SML method is 
cumbersome and can seem a bit odd since it requires going from price space to implied volatility 
space, back and forth. It is claimed that it is outperformed in term of accuracy and stability by 
simpler parametric methods in [7]. 
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1.3 Our results 



In this paper, we propose to view the RND recovery problem as an inverse problem. We first show 
that it is possible to define restricted put and call operators that admit a singular value decomposition 
(SVD), which we compute explicitly. We subsequently show that this new framework allows to devise 
a simple and fast quadratic programming method to recover the smoothest RND that is consistent 
with market bid-ask quotes. 



To be more precise, let us denote by I the segment [0, B] of the positive real line. We define the 
restricted put and call operators, denoted by 7* and 7, from L2X into itself (see eq. (2.1) and eq. (2.2) 
below) and show that they are conjugates of one another. We prove that the resulting self-adjoint 
operator 7*7 is compact. As a consequence of the spectral theorem (see [15]), 7* admits a singular 
value decomposition with positive decreasing singular values. We prove that the corresponding singular 
bases are complete in L2X (see Theorem 3.1, item 3)) and compute them explicitly together with their 
singular values (see Figure 1). To fix notations, we will write ((f k ) k >o an d (ipk)k>o the two orthonormal 
families of L2X such that 7*792^ = X\ip k , 77*V'fc = ^fc^fc' where (X k ) k >o 1S a positive decreasing sequence 
of singular values. Precisely, we obtain explicitly, 

B\ 2 



where 



Pk = - + kir + (-i) k p k , ken, 



and, for all k E N, (3 k is the smallest positive solution of the following fixed point equation in u, 

exp(vr/2 + kir + {-l) k u) 



\k \ _ 1 + COs(u) 



sin (u) 



Interestingly, the positive sequence {(3k) decreases exponentially fast toward zero as detailed in Lemma 
6.3. Therefore, the sequence of singular values (X k ) k >o tends asymptotically toward zero at a rate 
of order k~ 2 . The RND recovery problem is therefore said to be mildly ill-posed with a degree of 
ill-posedness equal to 2 (see [13, p. 40]). Furthermore, for all £ E X, we obtain, 



(PkiO = (ak,ie p ^ B + a k , 2 e-^ B ) + (a kj3 cos{p k t/B) + a kA sm{p k ^B) 
MO = (a k ,ie pk ^ B + a K2 e-P^ B ) - ( a k>3 cos(p k t/B) + o M sin(^/S) 



where the coefficients a ki i, i = 1, . . . , 4 are such that, 

a ~ 1 ( ~ 1)fc 

VBeP^ + i-l)^ 

a k ,2 = (-l) k e pk a k ^ ' 



Ofc,3 
Ofc,4 



v/Bl + (-l)*e-ft' 
1 



'B 

1 1 - (-l) k e~P k 
y/Bl + (-l) k e-Pk' 



Based on this new framework, we propose a spectral approach to RND recovery. It is fully nonpara- 
metric and can recover the restriction of any density to the interval X. To that end, we notice that 
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the singular bases functions ipk and ipk are in fact oscillations hk,2 at frequency Pk/B carried by the 
exponential trend h^i (see eq. (6.2) and eq. (6.1) for notations). Conveniently, smooth densities are 
therefore essentially captured by low singular spaces. The idea of recovering the smoothest density 
among the valid ones was initially suggested in [18]. Subsequently, [9] correctly pointed out that the 
smoothness criterion can be debated as it is difficult to give it an economic or even information theo- 
retic meaning. Our spectral approach sheds some new light on this issue and makes it clear that the 
smoothness criterion is justified by the fact that the restricted call and put operators behave as low- 
pass frequency filters. It is therefore illusory to look for high frequency information about the RND in 
a set of quoted options prices, since this information has been drastically attenuated by the operator. 
The smoothness criterion arises therefore as a by-product of the spectral nature of the restricted put 
and call operators and might well not be an intrinsic property of the true RND. Interestingly, smooth 
densities are also easier to recover by nonparametric means. 

In what follows, we exploit the rich framework offered by the SVD of the restricted put and call oper- 
ators to recover the smoothest RND that is compatible with market quotes. As detailed in eq. (7.1) 
below, the discounted restricted put operator coincides with the put price function (as a function of 
the strike) on I. We therefore propose to recover the smoothest RND such that its image by the 
discounted restricted put operator e~ rTr y* lies in-between the bid-ask quotes (see eq. (7.1)). Conve- 
niently, the singular bases present the property of being image of one another by second derivation 
modulo a multiplication by the corresponding singular value of 7* (see Theorem 6.1). This allows 
us to characterize the smoothness of the estimated RND directly in term of a quadratic form of the 
coefficients of the estimated put price function, which depends on the singular values of the restricted 
put operator (see Proposition 7.1). This crucial feature allows to recover the smoothest RND as the 
solution of a simple quadratic program, which takes the bid ask quotes as sole input. Our estimation 
method improves on existing ones in several ways, which we sum up here. 

• It is fast and simple to implement since it only requires solution of a single quadratic program, 
while being fully nonparametric. 

• It is robust to the paucity of price quotes since the smaller the number of quotes, the less 
constrained the quadratic program and thus the easier to solve. 

• It takes the bid ask quotes as sole input and does not require any sort of smoothing or prepro- 
cessing of the data. 

• It returns the smoothest density giving rise to price quotes that lie inside the bid ask quotes. 
The estimated RND is therefore as well-behaved as can be. 

• It returns a closed form estimate of the RND on X. We thus obtain both the middle part of 
the RND together with its left tail and part of its right tail. Interestingly, the left tail contains 
crucial information about market sentiments relative to a potential forthcoming market crash. 

It is noteworthy that the singular vectors ipo and tpQ corresponding to the largest singular value Ao of 7 
and 7* look themselves very much like cross sections of put and call prices, respectively (see Figure 1). 
In that sense, they will be able to capture the bulk of the shape of a cross section of option prices, 
while the subsequent singular vectors will add corrections to this general behavior. This is a crucial 
feature of this SVD that leads us to think that the singular bases of the restricted pricing operators are 
appropriate tools to recover the RND q. Interestingly, the performance of our quadratic programming 
algorithm on real data is indeed quite convincing (see Section 8 for details). 

Readers interested in appending a full right tail to this estimated RND are referred to [14], who pro- 
poses a simple method for smooth pasting of parametric GEV tail distributions to an estimated RND. 

Here is the paper layout. We introduce the restricted call and put operators, 7 and 7*, and operators 
derived therefrom in Section 2. We detail the properties of operators 7*7 and 77* on the one hand, 
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and 7 and 7* on the other hand, in Section 3 and Section 4, respectively. Other results relative to 
these four operators are reported in Section 5. Section 6 gives explicit expressions for the (A&), (<fk) 
and {ipkj- The spectral recovery method (SRM) is detailed in Section 7. Finally, we run a simulation 
study in Section 8. An Appendix regroups some additional useful results. 



2 Definitions and setting 

Let us define the restricted call operator on the interval X = [0, B] as the operator 7 from L2X into 
L2X such that, 

(2.1) (if)(0 = J e(Z,x)f(x)dx, ZeljeUX, 

It is a trivial fact that 7/ belongs indeed to L2X. Let's denote by (., .) the usual scalar product on L2X 
and by ||.||l 2 x the associated norm. Now, it is enough to notice that for all £, x EX, x)\ < B and 
apply Cauchy-Schwartz inequality to obtain, 

hf\\l 2 x < j x <% ^dx\e{^x)\\f{x)\j < B'\\f\\t 2l < 00. 

The adjoint operator 7* of 7 is such that, for all f,g€ L2X, 

<77, <?> = </, 73) 

duf(u) / dx0(u,x)g(x) 
1 Jx 

dxg{x) I du9(u,x)f(u). 
x Jx 

Hence 

(2.2) 7*/(0 = J 0*(£,x)f{x)dx, £eX,/eL 2 x, 

So that 7* is nothing but the restricted put operator on the interval X. In particular, we can write 

(2.3) 7*7/(0 = J ^,x)f(x)dx, £eX,/eL 2 X, 

(2.4) 777(0 = J MS, x)f(x)dx, CelJe L 2 x, 



where 



0i(£,ar) = / du9*(£,u)9(u,x) 



X 



p i>t,/\x 

/ du(£ — u) + {x — u) + = I du{^ — u){x — u) 
Jx Jo 

£x(£ A x) - (£ + x)(f A x) 2 /2 + (£ A x) 3 /3, 
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and 



Mtx) = [ du6(Z,u)6*(u,x) 



Ji 




I du(u — £) + (u — x) + = / du(u — £)(u — x) 

X J£Vx 



= £x(B -(Vi)-(( + x)(B-(V xf/2 + (B-£V xf/3. 



Let us now turn to the detailed inspection of these operators. 

3 Results relative to 7*7 and 77* 

Let us denote by 1Z(k) the range of an operator k of L2X and by J\T(k) its null space (see [12, p. 23]). 
Obviously both 7*7 and 77* are self-adjoint. This translates into the fact that their kernels are 
symmetric (meaning x) = £)). In addition, both $1 and $2 are continuous on the bounded 
square X x X. Therefore, the associated operators are compact (see [12, Ex. 4.8.4, p. 172]). As such, 
they verify the spectral theorem (see [12, Th. 4.10.1, 4.10.2, p.187-189]). 

THEOREM 3.1. Given the operators 7*7 and 77* defined in eq. (2.3) and eq. (2.4) above, we have 
the following results. 

1) The operators 7*7 and 77* are compact and self-adjoint. As such, they admit countable families 
of orthonormal eigenvectors (<fk) and (tpk) associated to the same positive decreasing sequence of 
eigenvalues A?, which are complete in 7^(7*7) and 7^(77*), respectively. 

2) Besides, we have 



where C 4 X stands for the set of four times differentiable functions on X. 
3) Furthermore, the orthonormal families (<fk) and (tpk) are complete in L2X. In other words, they 
are both orthonormal bases of L2X. In fact, we can write 



where 7^.(7*7) stands for the closure 0/7^.(7*7) in L2X (see [12, p. 16]) and Span{(pk, k G N} for the 
set of (potentially infinite) linear combinations of elements ip^. 
4) Therefore, 7*7 and 77* are both invertible and admit the fourth order differential operator df as an 
inverse (see [12, p. 155] for terminology). More precisely, we have got 



^(7*7) c l 2 x n c 4 x. 
^(7*7) c l 2 x n c 4 x. 



L 2 X = ^(7*7) = Span{tp k , k G N}, 



= ft(77*) = Span{ip k , k G N}, 



3f 7*7/ = /, 
7*73f / = /, 



V/ G L 2 X, 
V/ G 7e( 7 *7), 



and idem for 77* . 
5) Finally, we have the following spectral decompositions, 




f G L 2 X, 




/ G L 2 X, 
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and 

k>0 
k>0 

Proof. As detailed above, 1) follows directly from the spectral theorem. 2) follows directly from the 
kernel representations in eq. (2.3) and eq. (2.4). It can also be seen from the fact that, for any / G L 2 Z, 
both 7/ and 7*/ are twice differentiable, which follows by simple inspection of eq. (2.1) and eq. (2.2). 
3) follows directly from Proposition 5.1 below. 4) is a direct consequence of Lemma 8.1 below. Finally, 
5) follows directly from 1) and 3). □ 



4 Results relative to 7 and 7* 

The following theorem details the properties of the restricted put and call operators. It builds upon 
Theorem 3.1 above. 

THEOREM 4.1. Given operators 7 and 7* defined in eq. (2.1) and eq. (2.2) above, we have the 
following results. 

1) Consider the sequence of positive decreasing singular values and singular vectors ((fk) and {i^k) 
defined in Theorem 3. 1 above. The restricted put and call operators 7* and 7 are such that, for all 
k>0, 

J<Pk = AfcV'fc, 7*V'fc = Afc^fc. 



2) Besides, we have 

n(rt*) c L 2 inc 2 i, 
^(7) c l 2 x n c 2 i, 

where C 2 T stands for the set of two times differentiable functions on T. 

3) In addition, we have L 2 Z = lZ(~y*) = ^(7). So that both 7 and 7* are invertible and admit the 
second order partial differential operator <9| as an inverse. In particular, we obtain 

(4.1) sf 7 /(0 = df 7 7(0 = f(0, V/ G h 2 l. 

So that the knowledge of^f or/and 7* / allows to recover f directly as their second derivative. This 
is nothing but the so-called Breeden-Litzenberger formula restricted to the interval I. 

4 ) We have furthermore the following spectral decompositions, 

f = ^(/, <Pk)Vk, f G L^, 

k>0 

7/ = ^2^k(f,^k)ipk, / G L 2 X, 

and 



k>0 



fc>0 

, y*f = ^2 X k(f,^k) i Pk, / G L 2 X. 



fc>0 
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5) Finally, we have a put-call parity on the interval that can be written as follows 

(7-7*)/(£) = mi(/)-£rno(/)> 
where we have defined fhk{f) '■= f x x k f(x)dx. 

Proof. The proof of 1) follows directly from [13, p. 37]. 2) follows by simple inspection of eq. (2.2) and 
eq. (2.1). The first part of 3) follows from the facts that K{-f) = TZ(jj*) and K(-f*) = ^(7*7) (see 
1) above) and Theorem 3.1, item 3). The second part of 3) follows partly from Lemma 8.1 below (see 
Appendix) and partly from the obvious fact that / = 7*<9|/ for all / G 7Z(^f*) (idem for 7). 4) follows 
directly from 1) and 3). Finally, 5) follows immediately from the following obvious computations, 

(7 -7*)/(0= 7/(0-77(0 

= J[e^,x)-e*^,x)]f( x )dx 

= J(x-S)f{x)dx 
= rni(f) - £mo(/)- 

□ 

We regroup other results relative to the above operators in the following section. 

5 Other results relative to 7*7, 77*, 7* and 7 

We prove here that both orthonormal families ((fk) an d (iftk) are complete in L 2 Z. Other interesting 
results are to be found in the Appendix. Some of them are purely technical, while some others are of 
more general interest. 

Proposition 5.1. We have got, 

L 2 2" = 7^.(7*7) = Span{(p k , k > 0}, 
= 7^77*) = Span{^ kl k > 0}, 

where 7^.(7*7) stands for the closure 0/7^.(7*7) in L 2 Z (see [12, p-16]) and Span{(pk,k £ N} for the 
set of (potentially infinite) linear combinations of elements 

Proof. We know from [13, §2.3.] that, 

L 2 X = ^(7* 7 ) e ± AA( 7 * 7 ), 
= 7e(77*)e ± AA( 7 7*). 

Therefore, it is enough to show that both null-spaces reduce to the zero element. The kernel 
of 7*7 is constituted by the functions / G L 2 Z that are solutions of 

= 7*7/(6, V£ G 1. 

Deriving four times with respect to £ and applying Lemma 8.1 (see Appendix) leads to /(£) = 0,£ G I. 
So that ^(7*7) = {0}. Now it is enough to notice that AA(7*7) = M (7). However, we know from 
Lemma 8.2 that / G if and only if / G N{~f*) (see eq. (8.1) for notation). Therefore AA(77*) = 

AA( 7 *) = AA( 7 ) = AA(7* 7 ) = {0}, where by Af, we mean {/, / G J\f}. □ 
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6 Explicit computation of (A&), (ifk) and (ipk) 
6.1 Main result 

In this section, we give explicit expressions for the singular bases and singular vectors of the restricted 
call and put operators. The results are gathered below in Theorem 6.1. Let us write 

fk,i(0 = e p ^ B , /k I 2(0 = e-'* C/B , 

fk,3(0 = cos(p k t/B), fkAO = s ' m (Pk£/B), 

where 

(6.1) Pk = ^ + kK + {-l) k fa, k£N, 

and, for all k E N, (3 k is the smallest positive solution of the following fixed point equation in u, 

r m r i % l + cos(n) 

exp(vr/2 + kir + (-l) k u) = . , \ • 

sm{u) 

Interestingly, the positive sequence {(3 k ) decreases exponentially fast toward zero as detailed in Lemma 
6.3. In addition, we write, 

(6-2) h k) \ = + a>k,2fk,2, h k2 = a ky3 f k ^ + a k)4 f k ^, 

where the coefficients a kt i, i = 1, . . . , 4 are such that, 

1 (~l) fc 
V5e^ + (-l) fc ' 

a k ,2 = {-lfe pk a K ' ' 



V^l + (-l)Vft' 
1 

afe,3 = 



11- (-l) k e~P k 
° M ~ 71l + (-l) fe e-^' 

Then, we have the following theorem. 

THEOREM 6.1. The eigenvectors (<p k ) °f 1*1 an d (V'fc) °f 11* are such that 

(6.3) ip k = hk,i + h kj2 , i>h = hk,i - h k>2 . 
They are related by the following relationships, 

(6.4) 7V? fe = X k ip k , Y^k = ^k^Pk, 
where we have written 

(0.5) A fc =(£) 2 , 

and p k is defined in eq. (6.1). They verify ||</?fc||L 2 x = ||V ; fc||L 2 x = 1- Moreover, we have 
(6.6) MB) = ^' k (B) = Q, V*(0) = ¥4(0) = 0, 
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together with 

(6.7) ^ k = (-l)V, 0fe = (-l)Vfc, 

where we have written ipk(0 = ^k(B — £). And finally, we obtain as a direct consequence of eq. (4-1) 
above that 

AfcSfV'fc = d^ip k = <p k , 

Proof. Notice readily that eq. (6.6), eq. (6.7) and the fact that both tp^ and ifi^ are unit normed are 
straightforward consequences of eq. (6.3). In addition, eq. (6.4) is a repetition of Theorem 4.1, item 1). 
So that we are in fact left with proving eq. (6.3) and eq. (6.5). Each eigenvector / of 7*7 associated 
to the eigenvalue r 4 is solution of the problem, 

(6.8) r 4 / = 7*7/, 

for some r / and / G L2X. After differentiating four times the latter equation with respect to £ 
(assuming that / 6 IL,2XnC 4 Z) and applying Lemma 8.1, we obtain that the solutions of eq. (6.8) are 
also solutions of the following fourth order ordinary differential equation, 

r 4 ^ / - / = 0, 

where d| stands for the fourth order ordinary differential operator. Its characteristic polynomial admits 
four roots ±r _1 and ±zr _1 . Consequently, the real solutions of the above ordinary differential equation 
are of the form 

(6.9) /(£) = he^ r + b 2 e~^ r + 63 cos(£/r) + 64 sin(£/r). 

The (fkS are thus of this form. Plugging this generic solution back into eq. (6.8) leads in turn, after 
tedious but straightforward computations, to 

(6.10) Mb = 0, 

where b is a 4 x 1 vector such that b T = \b\ b 2 b% 64] and M is the 4x4 matrix defined by 



3.11) M(r, B) 



r l e B / r —r 1 e B l r r 1 sin(i?/r) —r 1 cos (B/r) 

_ r -2 e B/r _ r -2 e -B/r r ~ 2 cos (B / r) r~ 2 sin (.B/r) 

fy> 3 yi 3 Q iy* 3 

—4 -4 —4 n 

r T r (J 



There exists a non-trivial solution to eq. (6.10) if and only if r is such that the determinant of M 
cancels, that is Det(r, M) = 0. As detailed in Proposition 6.1, the roots of Det(r, M) = are exactly 
the r m = B/v m where v m is defined in eq. (6.16). In addition, we prove in Proposition 6.2 that 
the system M(r m , B)b = admits the unique solution b m . Reading off eq. (6.9), we obtain that the 
eigenvector of 7*7 associated to eigenvalue r 4 ^ writes as a m = T] m \ + r\ m2 where both f] m \ and rj m ^ 
are defined in eq. (6.15). Now, it is enough to notice that, given the properties of the sequence (u m ) 
detailed in Proposition 6.3, r 4 fc+1 = r 4 fc and v\ kJr2 < r 2fc+i' ^ G'N. In addition, we know from Lemma 
6.1 that a2fc+i = o;2fc. This allows us to conclude that the eigenvalues of 7*7 are, without redundancy, 
the A?, k € N, defined in eq. (6.5) and the associated eigenspaces are unit-dimensional and respectively 
spanned by the eigenvectors 92/%, k G N, defined in eq. (6.3). 

Computing ij)f. = X^ 1 Jfk leads, after tedious but straightforward computations to ij)f. = \ — h^^ 
and concludes the proof. □ 



12 



6.2 Additional results 

This section contains a series of results that are used throughout the proof of Theorem 6.1 above. In 
this section we make use of the map E : N i— > N such that E(2k + 1) = E(2k) = k for all k E N. 

PROPOSITION 6.1. Let M(r, B) be the 4 x 4 matrix defined in eq. (6.11). The set of solutions r 
to the problem DetM(r, B) = is countable. Let us denote them by r m ,m E N. For any m £ N, the 
solution r m can be written as 

B 

fm — j 

where v m is defined in eq. (6.16). We obtain in fact that, 

1 + (_!)%) sin (z, m ) 



DetM(r m ,B) = ^ e 



i',, 



COS^n 

Besides, the following relationships hold true 

1 



(6.12) cosi^ 



+ e Um cosh v m 

2 



(6.13) sin % :=-(-l) £ W + (-l) £ W , v . 

1 + e z ^ m 

Proof. It follows from straightforward computations that, 

(6.14) DetM(r,B) = 2e~ B/r ( cos (B/r) (e B/r ) 2 + 2e B/r + cos (B/r 



Let us write v := B/r and notice that if cos(z/) = 0, then DetM(r, B) = 2 7^ so that we must have 
cosv ^ for eq. (6.10) to admit a non-trivial solution. To be more specific DetM(r, B) = reduces 
to P(e u ) = where P(x) := cos (z^) x 2 + 2x + cos ). However the roots of P are given by 

cos(^) 

Henceforth, r = B/u cancels DetM(r, B) if and only if u is solution of anyone of the two following 
fixed point equations, 

u _ -l + sin(i/) v _ -1 — sm{y) 

cos(v) ' cos(^) 
The proof follows now directly from Proposition 6.3. □ 

Proposition 6.2. For any r m solution of the equation DetJ\L(r m ,B^) — (see Proposition 6.1 
above), the null space of M(r m , B) is of dimension 1 and is spanned by the vector 

bm = [bm,l &m,2 &m,3 ^m 4 ] j 



where we have written, 



b 



m,l 



1 (_l)#(m) 

v/Be"-" + (-l) E ( m ) ' 

1 1 



b m , 2 = (-l) E{m) e^a mA 
b m ,3 = 



v^l + (-l) fi We- 

1 



b m ,A — 

and v m is defined in eq. (6.16). 



B 

1 1 - (-l) E ( m ) e ~ u " 

B1+ (-l)E(m) e -u„ 
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Proof. It is a matter of straightforward linear algebra and thus left to the reader. Notice however, that 
it relies on the use of both eq. (6.12) and eq. (6.13). □ 

LEMMA 6.1. Let us write 

(m,l(0 = e Vm * /B , Cm,2(0 = e-^ /B , 

Cm,3(£) = cos(v m £/B), (mAQ = sm(v m £/B), 

where v m is defined in eq. (6.16). In addition, we write, 

(6.15) 7] mjl = 6 TO) lCm,l + bm^Cm.,2, Vm,2 = b m ,3(rn,3 + &m,4Cm,4, 

where the coefficients b m i,i = 1,...,4 are defined in Proposition 6.2. For all k G N ; we have the 
following relationships 

V2k+l,l = V2k,l, V2k+1,2 = V2k,2 

Proof. It follows from straightforward computations using the fact that f2m+i = — v 2m- D 

PROPOSITION 6.3. Let us define the map E : N i-> N such that E(2k) = E(2k + 1) = k for k G N. 

Let us write 

, . — 1 + sinf — 1— sinv 
g[u) = , h(v) = , 

cos V COS V 

and consider the fixed point equations e u = g{v) and e v = h(v). The set of corresponding solutions is 
exhausted by the sequence 

(6.16) v m = (-l) m (I + E(m)7T + (-l) E ^[3 E{m) ) , m G N. 

where (f3 m ) is defined in Lemma 6.3. In particular, notice that V2k+i = ~ v 2k an d WmA < \ v m 2 \ f or °W 
m\,m<i G N such that E(mi) < E(rri2). Notice in addition that, by construction, v m is solution of 

e v m= l + (-l)^sin^ 

COS V m 

This latter result, together with the fact that DetM{B /v m ,B) = (see eq. (6.14)), leads straightfor- 
wardly to the following relationships, 

2 1 

cosv m := — = , 

e" m + e Um cosh u m 

smu m := -(-l) fi W + (-l)^M ^ 

1 + e~ 



2u„ 



Proof. Consider the fixed point equation g(y) = e v . Given the properties of g detailed in Proposition 
6.4, two cases arise depending whether v is positive or negative. In the case where v is positive, the 
exponential map meets q a/t points of the form Pm — 2?n7r — u m for m G N = {0, 1,2, . . .} and 

some small but positive u m s. A direct application of Lemma 6.2 shows that the negative solutions are 
exactly the —p m ,m G N. 

The second fixed point equation h(v) = e v can be rewritten as g(—v) = e v . The positive solutions are 
of the form q m = 5 + 2rmr + v m , m G N. And, from Lemma 6.2 again, the corresponding negative 
solutions are the — q m , m G N. 
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Let us write t m = \ + mix + (— l) m /3 m , m G N. It is clear that £2fc = <7/c and i2fc+i = Pfe f° r k G N. In 
particular, t m is solution of 



(6.17) e 



l + (-l) m sint r 



cos L 



Let us define the map E : N h-> N such that £7(2/% + 1) = #(2fc) = fe for all A; G N. We define v m , m G N 
such that v m = (— l) m i_E( m ), that is f2fc = ifc and z^fc+i = ~~ tki k G N. By construction, z/ m exhausts 
the set of solutions of both fixed point equations e v = g{y) and e v = h(v). In fact, v m is solution of 



l + (-l) s ( m ) sinz/ m 

COS ^ m 

□ 

PROPOSITION 6.4. Notice readily that h{u) = g(—v), so that it is enough to study the properties of 
g alone. We have the following results, 

1. g is defined on the domain T> g = + 2mir,m G Z}; 

2. g is 2-7T periodic and such that, for all v G S g = (— g(V + 2rmr) = g{v); 

3. Finally, g is strictly increasing on S g and such that, 

IX 

lim g(v) = -00, g(-) = 0, lim g(u) = +00. 

"->®-f 2 f-> e 3p 

where we write — ^ (resp. -^-q) to mean the limit from the above (resp. below). 
4- Notice that M\D g (resp. M\T>h) corresponds exactly to the set of all the zeros of h (resp. g). 
Thus T> g V\T>h is the subset ofM. containing all the points where both g and h are well defined and 
different from zero. 

Proof. Let us first focus on the domain of g. It is defined on + mix, m G Z}. However, g can be 

extended by continuity to be worth zero at points 1| + 2mix, m G Z. Notice indeed that for any small 
positive u and i G N, one has got 

,7T , , „ ^ N — 1 + cos u 



0(77 + (-!)*« 



2 v ; ; -(-l)*sin 



!7 



" -(-l)% + 0(u 3 ) v ; 2 

With a slight abuse of notations, we denote the latter extension by g. So that g is actually defined on 
+ 2mir,m G Z}. The other properties are straightforward. □ 

LEMMA 6.2. Recall that T> g and T>h are defined in Proposition 6.4- Notice first that D g n T>h is 
symmetric, meaning that if u G T> g n P/^ i/ien — f G D T>h. For any v G T> g n 77/e /mwe i/ie 
following results, 

1. If v is solution of the fixed point equation e u = g{y\ then —v is also a solution. 

2. If v is solution of the fixed point equation e v = h{v), then —v is also a solution. 

Proof. Notice first that we have the identity h(v)g{v) = 1 for any v G T> g V\T>\ l . Its proof is immediate. 
And therefore, for any v G T> g n T>h solution of e v = g(y), we obtain g(—v) = h(u) = g^u)^ 1 = e~ v . 
And idem for the solutions of e v = h(v). □ 
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LEMMA 6.3. The sequence (/3k) is such that, for all k €N, /3k is the smallest positive solution of the 
following fixed point equation in u, 



exp vr/2 + kir + (-l) k u) = . . , . 

sm(u) 

In addition, the approximation /3k ~ 2e~2~ kn holds true with a large degree of accuracy from k = 1 
onward. 

Proof. Let us write tk = § + kir + (— l) k u, for some small but positive u such that tk is solution of 
eq. (6.17). Notice that 

cos (~ + kir + (-l) k uj = - sin(u) = -u + 0(u 3 ), 
sin (| + kir + (-l) fc u) = (-l) fc cos(n) = (-l) k + 0(u 
exp (| + fcvr + (-l) fc u) = el +fc?r (l + (-l) k u + 0(u 2 )). 

So that eq. (6.17) reduces to 

exp(vr/2 + kit + (-l) fe u) 
Plugging-in the Taylor expansions above, we obtain 



2\ 



fc \ _ 1 + cos(n) 



smut 



+ fc -(l + (-Ifu + 0( U 2 )) = ^±§^ = 1(2 + 0(u 2 )), 

u + C^) w 



which can be rewritten as 

(6.18) u = e-^- k7T (2 + 0(u)). 

It can be verified numerically that 2e~z~ k7T is a very good approximation of /3k as soon as k > 1 in 
the sense that eq. (6.17) holds true with a very large degree of accuracy. □ 

7 The spectral recovery method (SRM) 

In this Section, we first describe how 7 and 7* relate to the bid-ask quotes. We then show that the SVD 
of the restricted pricing operators described above can be used to design a simple quadratic program 
that recovers the smoothest RND compatible with market quotes. 

7.1 From 7 and 7* to call and put prices 

Let us denote by P(£) and C(£) the put and call prices at strike £ and by q the corresponding risk 
neutral density. Let us furthermore write I = R + \I = (B, 00). We assume that the restriction qij to 
the interval I of q is in L2X. For all the following relationships are immediate. 

(7.1) e"-i>(0=7*?(0, 

POO 

e rr C(H)=iq(0+ / (x-i)q(x)dx 
JB 

(7.2) =iq(0 + mi(q)-£m (q), 
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where we have defined, 



(/) = J_x k f(x)dx. 



Notice in particular that 

mo(q) = Q(S T > B) = 1 — m {q), 

mi(g) = E Q (5 r |5 r > B)Q(S T > B) = E Q S T - fhi(q). 

Eq. (7.1) shows that put prices directly relate to the restricted put operator. From an estimation 
perspective, this is a crucial feature that will allow us to recover the RND directly from market put 
quotes. Unfortunately, the situation is slightly different for call prices. As shown from eq. (7.2), call 
prices relate to the restricted call operator via m\{q) and mo(q), which are both unknown. Although, 
they could be estimated and give rise to an estimator of the RND based on quoted call prices, we wont 
pursue this route here, but rather focus on the simpler relation given by eq. (7.1). 

7.2 A refresher on no-arbitrage constraints 

For a detailed review of model-free no-arbitrage constraints, the reader is referred to [21, p. 32, § 1.8] 
and [11]. Let us denote by So the price today of the underlying stock. Let us moreover assume that it 
pays a continuous dividend yield 5. Let us denote by r the continuously compounded short rate and 
by r the time to maturity. Let us recall first that, by no-arbitrage, put and call prices are related by 
the put-call parity. 

(7.3) c(O-P(O = S e- ST -Ce- rT . 
Besides C(0) = Sq and -P(O) = 0. Let us now focus on put prices. We have, 

(7.4) max(0, £e" rT - Soe~ ST ) < P(£) < £e~ rT , 

(7.5) < a e p(0 < 

(7.6) o < a|p(0- 

Assume we are given an increasing sequence of n strikes £i < £2 < ••• < £,n an d a set of corresponding 
put prices mi, . . . ,m n . As described in [2], the above no- arbitrage relationships translate into a finite 
set of affine constraints on the latter put prices. These constraints can in fact be written in matrix form 
as Am < bp, where A stands for a 2n x n matrix, m is the n x 1 vector such that m T = [mi . . . m„] 
and bp is a 2n x 1 vector. More precisely, eq. (7.6) translates into n — 2 constraints as, 

, , , m,-_i_i — m,- rrij+2 — m,_i_i . 

i Am ^ ■= c c - c I ^ : = ^ i = 1, 2, ... ,n - 2 

Moreover, the left-hand-side of eq. (7.4) is fully captured in-sample by adding the following additional 
n constraints, 

(7.7) [Am] i+n -2 := -rrii < - max(0, ^e"' rr - 5 e _<5r ) := [b p ] i+n - 2 , i = l,...,n 

The right-hand-side of eq. (7.4) need not be taken into account at this stage. It is indeed less stringent 
than the upper-bound constraints we will impose in the next section. Finally, given the first n — 2 
constraints, eq. (7.5) reduces to two additional constraints, 

r A 1 m?l — m n-l , —rT rr, 1 

[Am\ 2n -i ■= t _t e := i b phn-i, 

sn sn— 1 



L4m] 2 n := mi - m 2 < := [b 



p\2n- 



Finally, let us recall that if the forward price Po of the underlying stock is observable today, then, by 
no-arbitrage, it must be equal to Soe^ r ~^ T . 
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,Ask 



.Ask 



£o = 



Ask 



S e~ 



V/ 



Figure 2: This graph sums up the set of constraints verified by estimated put prices, which are solutions 
of the quadratic optimization problem described in eq. (PI). Estimated put prices mi, . . . , m n on the 
"dense" grid £i, . . . , £ n are displayed as black dots. They must lie in-between the bid-ask quotes, which 
are represented by thick red dots ranging over quoted strikes £^ , . . . , £j s , which correspond to a sparse 
subset of the underlying dense grid £i, . . . , £ n . In addition, extreme put prices mi and m n are bounded 
above by y^ sk = and y^ sk , respectively, where the value of y^ sk is given in Section 7.3. Both y^ sk 
and y^ sk appear as thick blue dots at strikes £i = and £ n = B, respectively, mi, . . . ,m n must also 
verify the in-sample constraints described by the lhs of eq. (7.4). In particular, the lhs of eq. (7.4) 
ensures that the m^s are lower-bounded by the (£,ie~ rT — Soe~~ Sl ~) + s, which appear as thick blue dots. 
Since this lower-bound is worth for i = 1, this, together with the upper-bound y^ sk = actually 
impose mi = 0. Finally, mi, . . . , m n verify both eq. (7.5) and eq. (7.6) above. The latter constraint 
imposes in-sample convexity. 

7.3 Bid-ask spread constraints 

Let us assume that the market provides us with an increasing sequence of strike prices £i < £2 < 
. . . < £ s , where s typically ranges from 5 to 50 depending on the underlying. In addition, the market 
provides us with a corresponding sequence of bid ask quotes for put options. Let us denote them by 
yi sk , ■ ■ ■ ,yf sk and yf ld , ■ ■ ■ -,yf td - We want the corresponding fitted put prices (m.j) to lie inside the 
bid ask quotes. This corresponds to the following 2s affine constraints, 

(7.8) mi<y? sk , -mi<-y? id , i = l,...s. 

The quoted strikes might possibly span a very small portion of the segment I on which we want to 
recover the RND. In order to improve the quality of our estimator, we can constrain it to verify the 
above no- arbitrage constraints on a denser set of strikes than the quoted ones. Let us denote by 
£1 < £2 < ■ ■ ■ < £n this new set of strike prices, such that £1 = 0, £ n = B and including the initial 
quoted strikes. For later reference, we denote by I = {ix, . . . , i s } the subset of {1, . . . , n} corresponding 
to the indexes of the initial quoted strikes. We know that, in any case, we must have = P(0) = mi, 
so that we can define y^ sk = 0. Furthermore, we know from eq. (7.5) that P(£) cannot grow at a 
rate faster than e _rT , so that we can define y^ sk to be the corresponding linear extrapolation of the 
right-most market quote y^ sk , meaning y^ sk = y^ sk + e -rT (£ ra — £j s ). In summary, the requirement 
that the m^s fall in-between the bid-ask quotes translates into 2s + 2 additional constraints, which we 
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can write as follows 

(7.9) mi<yf sk , ielu{l,n}, 

(7.10) - mi < -yf id , i G /. 

All previously mentioned constraints are summarized in Figure 2. 



7.4 The quadratic program 

Fix N G N. The choice of N will be discussed in the next Section. Let us denote by Pn the estimator 
of the put price P on X built upon the <PkS up to level N and by e~ TT qN the corresponding inverse 
image by 7*. We have explicitly, from eq. (7.1) and Theorem 4.1, item 4), 

Pi * — TT 

N = 7 e q N , 

N 

Pn = y^^Wki 



k=0 



N 



QN 



e rT A fc 1 cj k ipk 



k=0 



for some u T = [uq . . . lon] G M^" 1 " 1 . Furthermore for a given matrix M, we will denote by [M]/ ; j 
the sub-matrix obtained by extracting the rows of M at indexes in I and the columns of M at indexes 
in J. When extracting all the columns, we will write [M]j„ and similarly for the rows. And we will 
naturally write [M]j in the case where M is a vector. The SRM estimator is obtained as a solution 
of a quadratic program. It corresponds (modulo rescaling by the A^s and the discount factor) to the 
coefficients of the smoothest density that verifies the no-arbitrage and bid-ask constraints above. To 
that end, notice that the L^X-norm of the second derivative of qN, namely Sn = ||^|9jv|Il 2 I) quantifies 
its smoothness. 5^v is often used as a smoothness penalty and has been widely used in the context 
of smooth RND recovery. Obviously, the smoother q^, the smaller Sn- As detailed in Proposition 
7.1, Sn can be directly expressed as a quadratic form of u involving the A + 1 first eigenvalues of the 
restricted put operator 7*. As a consequence, cj* is solution of, 



(PI') 



arg 



min \\dlq N \\l 2 



subject to < 



,Ask 



„.Bid 
-VI > 



[PN]lU{l,n} < y?u{l,n}' 

-[Pn]i < 

AP N < h 
Qn(0) 



Pi 
0. 



where, with a slight abuse of notations, we have written Pfi = [P/v(fi) ■■■ ^Mfn)], yf id stands 
for the vector of initial put bid quotes and yfjS^ n \ stands for the vector of initial put ask quotes 
augmented with the no arbitrage bounds y Ask = and y^ sk = yff k + e _rr (^ n — £j s ). Notice that we 
have added the constraint <7at(0) = 0, which does not arise as a natural property of the ip^s. 
Denote by <po,Ar(£) T = fao(£) ••• fNiO] an d 5 similarly, write iPq,n{Q T - Then we have [P/\r]i = 
l Po,N{£,i) T u an d Qn(0 = V ; o,A r (C) T ^A ra; > where Qn is defined below in Proposition 7.1. Let us finally 



denote by $ the matrix whose rows are constituted by the ifo : N(£,i) , i - 
With these notations, eq. (PI') can be rewritten in canonical form as 



1, . . . , n and write = [0]/ , 



(PI) 



1 T n 4 
arg mm —u \l N uj 



subject to < 



0/U{l,n}W 



,Ask 



< 



„.Bid 
-VI > 



'p, 

0. 
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which is nothing but a quadratic program in uj. This result is due to the following Proposition. 

Proposition 7.1. Let us write f N = X^feLo ^& WV'fc anc ^ 
(7.11) fi J v = aa !? (A 1 1 ...,A JV 1 ), 

which stands for the (N + 1) x (A + 1) diagonal matrix whose diagonal entries are the A^ 1 for k = 
0, ... , A. Then 

\\d1fN\\l 2 x = wXw. 

Proof. Notice indeed that dffjy = io T Q,Nd^o,N- However, as demonstrated above in Theorem 6.1, 
d^ipk = 1( fk- Hence, using the property that the ip^s constitute an orthonormal basis of L2X, we 
obtain 

N 
k=0 

□ 

7.5 Properties of eq. (PI) and choice of the spectral-cutoff TV 

A first question that arises is whether this quadratic program eventually admits a solution? In that 
perspective, it is straightforward to notice that eq. (PI) admits a solution if and only if Span{y>i,0 < 
i < A} admits an element which satisfies the constraints. Let us denote by D the subset of L2X which 
satisfies the constraints described in eq. (PI') and assume that P / 8. Obviously, eq. (PI) admits 
a solution as soon as A is large enough, since (ipi) is complete in L 2 X (see Proposition 5.1). On the 
other hand, it admits no solution when D = 0, that is when the constraints are incompatible. This 
latter situation might result from the presence of spurious data, since the presence of an arbitrage in 
the bid-ask quotes corresponds to a real arbitrage in the market, which would certainly be arbitraged 
away by practitioners. 

A second natural question that arises, is how to choose the spectral cutoff A? As detailed in eq. (PI), 
we aim at recovering the smoothest density qjy built upon ipQ,...ipN compatible with price quotes. 
As described in Theorem 6.1, tpk is constituted of a periodic component %2 oscillating at frequency 
Pk/B around an exponential trend where pk grows roughly speaking like k. It is therefore natural 
to think that the smaller A, the smoother the singular basis functions and thus the smoother the 
density gjv built upon them (although this needs not be the case, rigorously speaking). This intuitive 
observation, is justified through simulations (see Figure 5, bottom graph). In practice, we therefore 
suggest to choose N to be the smallest N such that eq. (PI) admits a solution. This is what we 
actually do in the forthcoming simulation study. 

Finally, let us point out that we could have chosen to impose a positivity constraint on qjy at each 
point of the underlying dense grid £1, . . . ,£ n , as an alternative to the in-sample convexity constraints 
on the (TOj)s described in eq. (PI). However, we have noticed via numerical simulations that results 
obtained in that way are less satisfying than with the convexity constraints on the mjS. We therefore 
opted for the convexity constraints. 

8 Simulation study 

We run a simulation study both on real and simulated data. The purpose of the estimation on simulated 
data is mostly to show that the SRM returns a valid RND estimator in extreme cases, when as little 
as 5 market quotes are available. 
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Recall from Lemma 6.3 that, from k = 1 onward, we can write flk ~ 2e"~ fc7r in eq. (6.1) above. This 
approximation is not valid for k = 0. In that case, however, we can solve eq. (6.17) numerically to 
obtain po = 1.875104069. This is the value of po we use in the following simulation study. 

Table 1: S&P 500 put option prices, Jan. 5, 2005. S&P 500 Index closing level = 1183.74; Option 



expiration = 


03/18/2005 (72 days); 


r = 2.69 


%; 5 = 


1.70%. 










Strike price 


500 


550 


600 


700 


750 


800 


825 


850 


900 


925 


Best bid 


0.00 


0.00 


0.00 


0.00 


0.00 


0.10 


0.00 


0.00 


0.00 


0.20 


Best offer 


0.05 


0.05 


0.05 


0.10 


0.15 


0.20 


0.25 


0.50 


0.50 


0.70 


Strike price 


950 


975 


995 


1005 


1025 


1050 


1075 


1100 


1125 


1150 


Best bid 


0.50 


0.85 


1.30 


1.50 


2.05 


3.00 


4.50 


6.80 


10.10 


15.60 


Best offer 


1.00 


1.35 


1.80 


2.00 


2.75 


3.50 


5.30 


7.80 


11.50 


17.20 


Strike price 


1170 


1175 


1180 


1190 


1200 


1205 


1210 


1215 


1220 


1225 


Best bid 


21.70 


23.50 


25.60 


30.30 


35.60 


38.40 


41.40 


44.60 


47.70 


51.40 


Best offer 


23.70 


25.50 


27.60 


32.30 


37.60 


40.40 


43.40 


46.60 


49.70 


53.40 


Strike price 


1250 


1275 


1300 


1325 


1350 












Best bid 


70.70 


92.80 


116.40 


140.80 


165.50 












Best offer 


72.70 


94.80 


118.40 


142.80 


167.50 













8.1 Real data 

We use the bid ask quotes reported in [14, Table 1] for put options on the S&P 500 Index on January 
5, 2005. For completeness, we reproduce the table here in Table 1. We choose B = 2 * Soe( r ~^ T , which 
corresponds to two times the Forward price on the underlying stock. This choice is arbitrary and 
produces an interval I, which is symmetric around the forward price. We observe from our simulation 
that the result is largely independent of the choice of B. However, the higher B, the higher we will 
need to go into the spectrum of 7*, since the smoothest RND that fits the data will be more and 
more concentrated around the center of the interval I. As regards the constraints, we choose the grid 
£1 , . . . , £ n to be such that = k — 1, k = 1, . . . , |_-BJ + 1 and if |_-BJ < B, we add £[bj+2 = B. Of 
course, this grid contains the initial 35 quoted strike prices since they are integer valued. With the 
above choice of B, the quadratic program given in eq. (PI) finds a feasible solution from spectral 
cutoff 66 onward. We report below in Figure 5. For the sake of comparison, we plot on the same 
figure the log-normal distribution obtained by least-square fit to the put prices obtained as average of 
the bid-ask quotes. The only parameter of the log-normal distribution that must be fitted is a (see 
Proposition 8.1), and we find a op t = 0.143. Interestingly, displays a small bump at the beginning 
of its left-tail, which does not appear in [14, Fig. 8] and could hardly be accounted for by parametric 
methods. Notice the small blip next to B in Figure 5. This boundary effect is due to the fact that all 
the ip^s and their first derivative are worth in B. In order to show that the choice of B has very little 
impact, we compute the RND estimator for B = 1.4 * SQe^ r ~^ T . Results are reported in Figure 3. As 
was expected, first feasible points appear at much lower spectral cutoffs, namely from spectral cutoff 26 
onward. Therefore, we plot g*. As can be seen from Figure 4, the put prices P* arising from eq. (PI) 
lie inside the bid ask quotes, while the ones produced by the fitted log-normal density lie outside. 



21 




1500 



800 900 1000 1100 1200 1300 1400 1500 



Figure 3: Here we plot the RND q 2G (solid line) estimated from the real price quotes reported in 
Table 1. We choose B = 1.4 * F = 1.4 * S * e( r_<5 ) r = 1660 for that plot. In addition, we plot the 
best log-normal fit (in a least-square sense) to the average price quotes (dashed line). It is obtained 
for a op t = 0.143. At the top, we display the full left tail of the RND At the bottom, we zoom in 
on the fat left tail of the estimated RND distribution. 
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Figure 4: Here we plot the fitted put prices obtained from the setting described above in Figure 3. The 
solid line corresponds to the fitted prices P*, while the dashed line corresponds to the fitted prices 
obtained from a log-normal distribution. The stars and dots correspond to market ask and bid quotes, 
respectively. At the top, we give a large view of the fits. At the bottom we zoom in to show that P* 
lies inside the market quotes, while the fitted log-normal prices lie outside. 
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Figure 5: Here we plot the RND q e& (solid line) estimated from the real price quotes reported in 
Table 1. We choose B = 2*F = 2*S * e( r_<5 ) r = 2372 for that plot. In addition, we plot the best 
log-normal fit (in a least-square sense) to the average price quotes (dashed line). It is obtained for 
&opt = 0.143. At the top, we display the full left tail of the RND and its full right tail up to B. At 
the bottom, we superimpose (solid line) with (dashed line) obtained in Figure 3 for an other 
choice of B. Notice the strong agreement between both densities, which highlights the stability of the 
SRM with respect to the choice of B. Interestingly, is slightly more bumpy than at the level 
of its left fat-tail. This reinforces our argument that smoothness goes hand in hand with low spectral 
cutoff. 
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8.2 Simulated data 



As regards the simulated data, we work in the Black-Scholes setting. In that context the price of a 
put option admits a closed form solution and the RND is log-normal (see Proposition 8.1). We model 
the bid-ask spread as a random noise around the true price given by the Black-Scholes formula. More 
precisely, for a given set of quoted strikes £1 < • ■ • < £s and corresponding put prices P(£i), • • • , P(£s)> 
we write yf sk = P(&) + %i/2 and yf ld = P(£i) — Zi/2, where zi = max(l, min(3, the £j's are 
iid standard normal random variables and w = 0.1 maxi<j< s P(£i). The bounds 1 and 3 are chosen 
by analogy with the real data quotes in Table 1. Of course, the bid-ask quotes we obtain in that way 
are not arbitrage free. However, they contain the true put price P(£), which, given the nature of the 
quadratic program described in eq. (PI) above, is all that matters to approximate the true RND. For 
the sake of simplicity, we choose r = 0, 6 = 0, r = 1, So = 100, and a = 0.3 and B = 2* Fq = 2* Sq. 
In addition we set a first strike price at [PoJ an d spread the others on its left and right sides at unit 
length distance away from each other until we obtain s strikes. More precisely, the second strike would 
be [Fo\ — 1, the third |_PoJ + 1, the fourth [Fo\ — 2 and so on and so forth. We plot the results for the 
first two spectral cutoff's at which a feasible point is found below in Figure 6 in the case where there 
are as little as s = 5 bid ask quotes and in Figure 7 in the case where there are as many as s = 50 of 
them. In any case, we can see that we obtain a smooth density that resembles the log-normal density 
generating the initial quoted prices and that the estimate is stable from one spectral cutoff to another. 
Of course, the more strikes we have, the better the fit. Besides, we observe as expected from an other 
simulation not reported here that, the smaller the bid-ask spread, the better the fit. Notice once again 
that the fitted right-tail reaches zero in B, while the true one is strictly positive at that point. As 
before, this is due to the fact that ipk{B) = 0. 
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Figure 6: Here we are in the case of 5 simulated bid ask quotes and with B = 2 * F$ = 200. The 
first two plots display q£ and q£ (dashed line), the true log- normal RND used to generate the prices 
(dashed-dotted line) and the orthogonal projection of the true log-normal RND on {-0o> • • • , i^n} f° r 
N = 5 and N = 6 (solid line), respectively. The last two plots display the fitted put prices, that is P* 
and (dashed line) together with the true prices (dashed-dotted line). 
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Figure 7: Here, we repeat the same plots as in Figure 7 in the case of 50 simulated bid-ask quotes. 
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Appendix 

Refresher on the Black-Scholes model 

This is a well-known result of mathematical finance. 

PROPOSITION 8.1. Let us denote by Sq the price today of a stock paying dividends continuously 
over time at a constant rate 5 and by r the continuously compounded risk-free rate. The arbitrage price 
today of a put option on that stock maturing at time r is given by the following closed form formula, 

P(£) = Ze- rT N(-d 2 ) - Soe-^A/X-di), 

with 

, MVP + [(r -5) + \a 2 \r 

where a stands for the volatility of the stock and N for the standard normal cumulative distribution. 
In addition, the RND is log-normal and writes as 

q(x) = V^Tx exp { 27v J • 

Proof. These results can be found in see [21, p. 117], for example. □ 
Additional results relative to 7 and 7* 

We now present three results relative to 7 and 7* , which are either used in the core of the paper or of 
interest in their own right. 

PROPOSITION 8.2. The operators 7 andj* admit no eigenvectors. 
Proof. Suppose / is an eigenvector of 7 associated to eigenvalue A, then denote by 
(8.1) f(t) = f(B-£), 

and notice that for all £ S X, a direct application of Lemma 8.2 allows to write 

Xf(B -0 = A/(0 = 7/(0 = l*f(B ~ 0- 

Thus / must be an eigenvector of 7*. However, it is well known that 7* admits no eigenvalue since, 
for any A 7^ 0, 

A/(0 = 7*/(0 = / 0*(Z, x)f(x)dx, £ G X, 

J 

defines a homogeneous Volterra equation in /, whose unique trivial solution is / = (see [12, p. 239, 
Th. 5.5.2]). □ 

Finally, let us point out the two following useful lemmas. 

LEMMA 8.1. Let us denote by 9| the k th order partial differential operator with respect to £. Then, 
for any f G L2X, we have the following results. 

f = dfrf, f = a| 7 7, 

/ = <9f 7 *7/, / = 3f777- 
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Proof. Notice indeed that 

daf(0 = % j\x - Of(x)dx = -J* f(x)dx, 

£>a* fit) = % [\a - x)f( x )dx = f f(x)d X . 

Jo Jo 

Therefore, we obtain immediately 

/ = of 7/ = a| 7 7- 

The remaining of the proof follows directly from these first results. Notice indeed that, 

^7*7/ = <9f [<9f7*](7/) = %S = /• 
which concludes the proof. □ 
LEMMA 8.2. For any f G L2Z" and £ G X , we have 7 / '(£) = 7* / (B — £) (see eq. (8.1) for notations). 
Proof. Perform the change of variable u = B — x to obtain 

7/(0= ( B {x-t)f(x)dx 



/ ([B-i]-u)f(u)du = 1 *f{B-i). 
Jo 



□ 



Relation between the (ipk)s and the (^fc)s 

We believe that mo(q) and m\{q) could be readily estimated from the data, so that eq. (7.2) could be 
used to construct a second estimator of the RND based on the restricted call operator. This second 
estimator could eventually be combined with the one obtained from the SRM above. To that end, and 
for the sake of completeness, we compute the scalar products between elements of the two singular 
bases. Results are reported in the following proposition. 

Proposition 8.3. Let us write 

Pk, m (x, V) = ("^ + x 2 y)(-l) m+k - xy 2 + y 3 , 
q k ,m(x, V) = (x 3 + x 2 y)(-l) k + (y 3 + y 2 x)(-l) m . 

Then, we have the following relationships, 

.Pk.m(Pk,P m )e~ Pk ~ Pm - qk,m(Pk,Prn)e~ Pk + (\k,m (Prn , Pk)^ Pm + \>k.m(Pm, Pk) , , 

Cpt-/4)(l + (-l) m e-^)(l + (-l)*e-«) ' T ' 

, -e- 2p -{ Pk + 2) + 2 Pk {-l) k e- p x - p k + 2 

{(Pk,Vk) = , — —, 1U „ ■ 

(e P" + {-l) k yp k 
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On the way, we obtain, 
(hk,i, h m s) 

~ [[ ' +[ ' ' (^-/4)(l + (-l) fc e-P»)(l + (-l)™e-^) 

(hk,i,h mi 2) 



,,_ u k , m (fik+p m ){e~^+e^)-{-\)Hp k -p m ){l + e-^+P^) 

[[ ' [ )] (^+/4)(l + (-l) m e-P™)(l + (-l)*e-P») ' ^ 



(hk,i,hk,i) 



1 - e" 2 ^ + 2(-l) fc P fc 



p k (i-i) k + e-p*y ' 

{hk,\,hk,2) = 0, 

{hk,2,h m ,2) — Sk,m ~ (hk,l,hm,l)- 

Proof. Recall that, for all A;, m, we have defined 

fyfe,l = Ofe,l/fc,l + Ofc,2/fc,2, fyfc,2 = afc,3/fc,3 + Ofc,4/fc,4, 

Besides, we have that 

(Vfcj V^m) = <5fc,m = /lm,l) + (/lfc,2, ^m,2) + ^™> 2 ) + (^fc,2j ^m,l)j 

(^k,1pm) = b~k,m = (hk,l,h my l) + (h k ^,h m ^ 2 ) - (h k ^,h my2 ) - (%,2,^m,l)- 

Therefore, we obtain the following relationships, 

<5fc,m = (hk,i,h m: i) + (h ky2 ,h m ^), 

= {hk,l,h„ h 2} + (/lfc,2j ^m,l)- 

Which leads to 

= 2((/lfc 5 i, /l m ,l) — (/lfc,l, h m ^)) — 5k,m- 

Now, it remains to compute (/ifc,i, /i m ,i) and (hk,i,h m p)- The results follow from lengthy and tedious 
but straightforward computations and are therefore not reported here. □ 



From the RND q of S T to the density of In S T 

Some authors have chosen to focus on the estimation of the density of log S T rather than on the density 
of S T itself. Both densities relate by a simple transformation, as described in the following proposition. 
In our case, this transformation can be readily applied since the SRM returns an analytic expression 
for the estimated RND. 

PROPOSITION 8.4. IfX admits f(x) for density on R, then Y = exp(X) admits ^/(lny) for density 
on R + . Conversely, if Y admits f(y) for density on R + , then X = ln(Y) admits e x f(e x ) for density 
on R. 
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